Google recently released a new version of its Gemma3 series, exciting many AI enthusiasts. Just a month after its initial launch, Google released a Quantization Aware Training (QAT) optimized version of Gemma3, aiming to significantly reduce memory requirements while maintaining model quality. Specifically, the QAT-optimized Gemma3 27B model reduces VRAM requirements from 54GB to 14.1GB, meaning users can now run it on a single NVIDIA RTX 3090.